Distributional Consistency: As a General Method for Defining a Core Lexicon
Abstract
We propose Distributional Consistency (DC) as a general method for defining a core lexicon. The properties of DC are investigated theoretically and empirically, showing that it is clearly distinguishable from word frequency and from range of distribution. DC is also shown to reflect intuitive interpretations, especially when its value is close to 1. Immediate applications in NLP include defining a core lexicon for a language and identifying topical words in a document. We also categorize existing measures of dispersion into three groups, via ratio, norm, or entropy, and propose a simplified measure and a combined measure. These new measures can serve as virtual prototypes or intermediate types for the study and comparison of existing measures in the future.
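The abstract does not spell out how DC is computed. As a rough illustration only, the sketch below scores each word over n equal-sized corpus parts with a dispersion-style consistency formula, (Σᵢ √fᵢ)² / (n · Σᵢ fᵢ), which equals 1 for a perfectly even distribution and falls toward 1/n for a word confined to a single part. The formula, the `dc_score` and `core_lexicon` names, and the threshold are assumptions made for illustration, not the paper's own definition.

```python
from collections import Counter
from math import sqrt

def dc_score(freqs):
    """Consistency score for one word, given its frequencies f_1..f_n
    in n equal-sized corpus parts (assumed formula, not the paper's):
        (sum_i sqrt(f_i))**2 / (n * sum_i f_i)
    = 1 when the word is spread evenly over all parts,
    ~ 1/n when it is concentrated in a single part."""
    n = len(freqs)
    total = sum(freqs)
    if total == 0:
        return 0.0
    return sum(sqrt(f) for f in freqs) ** 2 / (n * total)

def core_lexicon(parts, threshold=0.8):
    """Rank words by consistency over the corpus parts and keep those
    whose score exceeds a (hypothetical) cut-off."""
    counts = [Counter(tokens) for tokens in parts]
    vocab = set().union(*counts)
    scores = {w: dc_score([c[w] for c in counts]) for w in vocab}
    return sorted((w for w, s in scores.items() if s >= threshold),
                  key=lambda w: -scores[w])

# Toy usage: "the" occurs in every part, "quark" only in one,
# so only "the" clears the consistency threshold.
parts = [
    "the cat sat on the mat".split(),
    "the dog ate the bone".split(),
    "quark quark quark the".split(),
]
print(core_lexicon(parts))
```

Under this assumed scoring, a word's raw frequency and its range (number of parts it occurs in) can both be high while its consistency stays low, which matches the abstract's claim that DC is distinguishable from frequency and range of distribution.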